Development of a WFST based Speech Recognition System for a Resource Deficient Language Using Machine Translation

Authors

Arnar Thor Jensson

Tasuku Oonishi

Koji Iwano

Sadaoki Furui

Abstract

Text corpus size is an important issue when building a language model (LM), particularly where insufficient training and evaluation data are available. In this paper we continue our work on creating a speech recognition system with an LM trained on a small amount of text in the target language. To improve performance, we use a large amount of foreign text and a dictionary that maps between the two languages. A dictionary is used because we assume that the target language is resource deficient and that statistical machine translation (MT) is therefore not available. We take a step forward from our previously published method by coupling the speech recognition part with the translation part rather than pre-translating the foreign text. The coupling is achieved with a weighted finite state transducer (WFST) network, which also makes it easy to switch the output language, i.e. the output text can be in either the resource-deficient language or the resource-rich language. With the system trained on 1500 Icelandic sentences and an English text corpus of 63003 sentences, our method outperforms the resource-deficient Icelandic speech recognition baseline of 82.6% keyword accuracy (KA), both for the English output (2.6% absolute KA improvement) and for the Icelandic output (1.6% absolute KA improvement).
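The coupling idea can be illustrated with a toy sketch. This is not the authors' implementation: it is a minimal, epsilon-free WFST composition in plain Python, and the class `Wfst`, the functions `compose` and `best_path`, the three-word vocabulary, and all weights are hypothetical choices made only for illustration. A recognized Icelandic word sequence is composed with a dictionary transducer and then with an English LM acceptor, so the best path carries both an Icelandic side and an English side.

```python
import heapq
from collections import defaultdict


class Wfst:
    """Toy weighted transducer; weights are costs that add along a path."""

    def __init__(self, start, finals, arcs):
        self.start = start
        self.finals = dict(finals)          # state -> final cost
        self.arcs = defaultdict(list)       # state -> [(in, out, cost, next)]
        for src, ilab, olab, w, dst in arcs:
            self.arcs[src].append((ilab, olab, w, dst))


def compose(a, b):
    """Epsilon-free composition: a's output labels must match b's input labels."""
    start = (a.start, b.start)
    arcs, finals = [], {}
    stack, seen = [start], {start}
    while stack:
        state = stack.pop()
        qa, qb = state
        if qa in a.finals and qb in b.finals:
            finals[state] = a.finals[qa] + b.finals[qb]
        for ilab, mid, wa, na in a.arcs.get(qa, []):
            for mid2, olab, wb, nb in b.arcs.get(qb, []):
                if mid == mid2:
                    nxt = (na, nb)
                    arcs.append((state, ilab, olab, wa + wb, nxt))
                    if nxt not in seen:
                        seen.add(nxt)
                        stack.append(nxt)
    return Wfst(start, finals, arcs)


def best_path(t):
    """Dijkstra search; returns (cost, input labels, output labels) or None."""
    heap = [(0.0, 0, t.start, (), ())]
    tick = 1                                # unique tie-breaker for the heap
    visited = set()
    while heap:
        cost, _, q, ins, outs = heapq.heappop(heap)
        if q is None:                       # sentinel: a final state was left
            return cost, ins, outs
        if q in visited:
            continue
        visited.add(q)
        if q in t.finals:
            heapq.heappush(heap, (cost + t.finals[q], tick, None, ins, outs))
            tick += 1
        for ilab, olab, w, dst in t.arcs.get(q, []):
            heapq.heappush(heap, (cost + w, tick, dst, ins + (ilab,), outs + (olab,)))
            tick += 1
    return None


# Hypothetical recognizer output: a linear chain over three Icelandic words.
sentence = Wfst(0, {3: 0.0}, [
    (0, "hvar", "hvar", 0.0, 1),
    (1, "er", "er", 0.0, 2),
    (2, "hótelið", "hótelið", 0.0, 3),
])

# Hypothetical dictionary transducer: one state, ambiguous entry for "er".
dictionary = Wfst(0, {0: 0.0}, [
    (0, "hvar", "where", 0.0, 0),
    (0, "er", "is", 0.5, 0),
    (0, "er", "am", 0.5, 0),
    (0, "hótelið", "hotel", 0.0, 0),
])

# Hypothetical English LM acceptor that prefers "where is" over "where am".
english_lm = Wfst(0, {3: 0.0}, [
    (0, "where", "where", 0.25, 1),
    (1, "is", "is", 0.5, 2),
    (1, "am", "am", 2.0, 2),
    (2, "hotel", "hotel", 0.25, 3),
])

coupled = compose(compose(sentence, dictionary), english_lm)
cost, icelandic, english = best_path(coupled)
print(cost, icelandic, english)
```

In this toy setting the English LM resolves the dictionary ambiguity, and reading the input labels or the output labels of the same best path gives the Icelandic or the English text, which is one simple way to picture the output-language switch described in the abstract.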

Similar resources

Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting

Islamic Republic of Iran Broadcasting (IRIB), as one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, the IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieval of this archive, is one of the key issues for this broadcasting...

Language model adaptation using WFST-based speaking-style translation

This paper describes a new approach to language model adaptation for speech recognition based on the statistical framework of speech translation. The main idea of this approach is to compose a weighted finite-state transducer (WFST) that translates sentence styles from in-domain to out-of-domain. It makes it possible to integrate language models of different styles of speaking or dialects and even of dif...

A WFST-based log-linear framework for speaking-style transformation

● Objective: Transform spoken-style language (V) into written style language (W) for the creation of transcripts
● Approach: Statistical machine translation to “translate” from verbatim text to written text
● Innovations:
  ● Log-linear modeling for improved accuracy
  ● Introduction of features to handle common phenomena in speaking-style transformation
  ● WFST-based implementation for integration with W...

Spoken Language Processing Using Weighted Finite State Transducers

The main goal of this paper is to illustrate the advantages of weighted finite state transducers (WFSTs) for spoken language processing, namely in terms of their capacity to efficiently integrate different types of knowledge sources. We shall illustrate their applicability in several areas: large vocabulary continuous speech recognition, automatic alignment using pronunciation modeling rules, g...

Minimum Bayes-Risk Techniques in Automatic Speech Recognition and Statistical Machine Translation

Automatic Speech Recognition (ASR) and Machine Translation (MT) are fundamental language technologies that are emerging as core components of information processing systems. Each of these problems can be evaluated using a variety of metrics that measure different aspects of recognition or translation performance. In contrast, the training and decoding architectures of most of the current ASR an...

Publication date: 2009
